Genomic Phylostratigraphy

Genomic phylostratigraphy is a statistical method in genetics developed to date the origin of specific genes by looking at their homologs across species. It was first developed at the Ruđer Bošković Institute in Zagreb, Croatia. The method links each gene to its founder gene, which makes it possible to estimate the gene's age. This in turn could help clarify many evolutionary processes.


Method

This technique relies on the assumption that the diversity of the genome is due not only to gene duplications but also to continual, frequent de novo gene births. These new genes (called "founder genes") would form from non-genic DNA sequences, from changes in reading frame (or other ways of arising from within existing genes), or even from evolution of a protein so rapid that its sequence is modified beyond recognition. New genes would at first have high evolutionary rates that then slow down with time, so that their lineage remains recognisable in their descendants. Each founder gene can then be assigned to a specific phylostratum. A phylostratum is represented as the clade that includes all the genes that derive from the same founder gene, signifying that this gene was formed in the common ancestor of this clade (e.g. Arthropoda, Mammalia, Metazoa). Placing founder genes and their descendants on different phylostrata makes it possible to estimate their age. These ages can then be used to analyse the origin of certain protein functions and developmental processes on a macroevolutionary scale, including by observing connections between genes.

The original method for genomic phylostratigraphy involves the use of a
BLAST sequence similarity search with an E-value cutoff of 10^-3. The genes deemed similar enough in sequence are gathered, and the clade encompassing all the taxa represented by those genes is determined. This clade then becomes the phylostratum of these genes. Determining the common ancestor of this clade gives an age for the founder gene and all its descendants. Applying the process on a genome-wide scale makes it possible to detect patterns of founder gene births and to infer the role of genes involved in clade-specific developmental processes and physiological pathways, as well as the origin of those traits.

In the original paper, the developers of the method gave an example of how to use this system in practice with Drosophila. They gathered 13,000 genes and determined their founder genes, grouping them into their respective phylostrata. They also separated the gene families according to whether they were mainly expressed in one of the three
germ layers (endoderm, mesoderm, ectoderm). By studying how frequently genes from the different phylostrata were expressed, they were able to tentatively trace the formation of those germ layers to specific periods and ancestral organisms in evolutionary history.

Since its invention, genomic phylostratigraphy has been used regularly by this research team and by others, notably in an attempt to determine the origin of cancer genes. That work appears to show a strong link between a peak in the formation of cancer genes and the transition to multicellular organisms, a connection that had previously been hypothesised and is further supported by phylostratigraphy. As its use has grown, the method has been assessed and enhanced on multiple occasions, and programs that run it automatically and more efficiently have been developed.

One of the most prominent uses of genomic phylostratigraphy has been in inferring the correlation between phylogeny and developmental processes (often called the phylogeny-ontogeny correlation). Using the method, scientists have found a significant phylogeny-ontogeny correlation in animals, plants, fungi, and even bacterial biofilms.
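
The assignment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the lineage in `PHYLOSTRATA`, the species-to-stratum table `SPECIES_STRATUM`, the `Hit` record, and the function `assign_phylostrata` are all hypothetical names, and the similarity hits are assumed to have been computed beforehand (for example by a BLAST search). Only the 10^-3 E-value cutoff comes from the text.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified lineage of the focal species,
# ordered from the oldest clade to the youngest.
PHYLOSTRATA = ["Cellular organisms", "Eukaryota", "Metazoa", "Arthropoda", "Drosophila"]

# Hypothetical lookup: for each species, the clade on the focal lineage
# that it shares with the focal species (their common-ancestor clade).
SPECIES_STRATUM = {
    "E. coli": "Cellular organisms",
    "S. cerevisiae": "Eukaryota",
    "H. sapiens": "Metazoa",
    "A. mellifera": "Arthropoda",
    "D. melanogaster": "Drosophila",
}

E_VALUE_CUTOFF = 1e-3  # cutoff quoted for the original method


@dataclass(frozen=True)
class Hit:
    """One precomputed similarity hit for a focal gene (illustrative record)."""
    gene: str
    species: str
    e_value: float


def assign_phylostrata(hits, focal_genes):
    """Map each focal gene to the oldest clade in which a hit passes the cutoff.

    Genes without any passing hit stay in the youngest stratum.
    """
    rank = {name: i for i, name in enumerate(PHYLOSTRATA)}
    oldest = {gene: len(PHYLOSTRATA) - 1 for gene in focal_genes}
    for hit in hits:
        if hit.gene not in oldest or hit.e_value > E_VALUE_CUTOFF:
            continue  # unknown gene, or hit too weak to count as a homolog
        oldest[hit.gene] = min(oldest[hit.gene], rank[SPECIES_STRATUM[hit.species]])
    return {gene: PHYLOSTRATA[r] for gene, r in oldest.items()}


# Made-up example hits for three focal genes.
example_hits = [
    Hit("geneA", "S. cerevisiae", 1e-20),
    Hit("geneA", "A. mellifera", 1e-50),
    Hit("geneB", "H. sapiens", 0.5),      # fails the cutoff, ignored
    Hit("geneB", "A. mellifera", 1e-5),
]
strata = assign_phylostrata(example_hits, ["geneA", "geneB", "geneC"])
```

In this sketch a gene's phylostratum is simply the oldest clade on the focal lineage that still contains a species with a passing hit, which is equivalent to taking the clade that encompasses all taxa represented by the gene's homologs.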


Criticism

Although it is now used frequently by the scientific community, genomic phylostratigraphy has also been criticised as too inaccurate for its measurements to be trustworthy. First, according to some authors, its assumptions are imprecise. For example, it is erroneous to assume that all species beyond the organism of focus share the same protein evolutionary rate: the rate varies depending on cell cycle speeds, which makes it difficult to set the limits of BLAST error so as to encompass all proteins that originated from the same founder gene. Second, the BLAST search assumes that the protein evolutionary rate is constant across all sites of a protein, which is also false. Lastly, the model arguably does not account correctly for gene duplications and gene losses: changes in evolutionary rate that follow gene duplication and functional change would increase BLAST error rates, and gene loss in taxa distant from the one studied could lead to large underestimates of the calculated gene age and phylostratum of founder genes compared to their true values.

Rather than calling for the method to be abandoned, however, critics have worked on refining it by introducing other mathematical formulas or sequence search tools. Researchers at the Ruđer Bošković Institute have replied to such criticism, claiming that their original approach was valid and did not need extensive revision. The debate is part of the wider discussion on the importance of de novo gene births in creating genetic diversity. Genomic phylostratigraphy supports the view that they have a strong effect, so the method can only be widely accepted or refuted once that wider question has been resolved.
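
The underestimation concern can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than any published analysis: the lineage `STRATA`, the helper `phylostratum`, and the sets of strata are invented. It shows only that when homologs in the most distant taxa are missing, whether through gene loss or through divergence beyond what the similarity search can detect, the gene is placed in a younger stratum than its true one.

```python
# Hypothetical focal lineage, ordered from the oldest clade to the youngest.
STRATA = ["Eukaryota", "Metazoa", "Arthropoda", "Drosophila"]


def phylostratum(strata_with_homologs):
    """Return the oldest stratum in which a homolog was detected.

    With no detected homolog the gene falls into the youngest stratum.
    """
    ranks = [STRATA.index(name) for name in strata_with_homologs]
    return STRATA[min(ranks)] if ranks else STRATA[-1]


# Strata in which homologs of an invented gene truly exist.
true_homologs = {"Eukaryota", "Metazoa", "Arthropoda"}

# Assumed scenario: the most distant homologs were lost or went undetected.
detected_homologs = true_homologs - {"Eukaryota"}

true_stratum = phylostratum(true_homologs)          # based on all real homologs
inferred_stratum = phylostratum(detected_homologs)  # based on what the search sees

# A larger index means a younger stratum, i.e. an underestimated gene age.
age_underestimated = STRATA.index(inferred_stratum) > STRATA.index(true_stratum)
```

Here the gene truly dates to the oldest stratum but is inferred to be one stratum younger, which is the direction of bias the critics describe.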

